A new strategy for hit generation: Novel in cellulo active inhibitors of CYP121A1 from Mycobacterium tuberculosis via a combined X-ray crystallographic and phenotypic screening approach (XP screen)

There is a pressing need for new drugs against tuberculosis (TB) to combat the growing resistance to current antituberculars. Herein a novel strategy is described for hit generation against promising TB targets involving X-ray crystallographic screening in combination with phenotypic screening. This combined approach (XP Screen) affords both a validation of target engagement as well as determination of in cellulo activity. The utility of this method is illustrated by way of an XP Screen against CYP121A1, a cytochrome P450 enzyme from Mycobacterium tuberculosis (Mtb) championed as a validated drug discovery target. A focused screening set was synthesized and tested by such means, with several members of the set showing promising activity against Mtb strain H37Rv. One compound was observed as an X-ray hit against CYP121A1 and showed improved activity against Mtb strain H37Rv under multiple assay conditions (pan-assay activity). Data obtained during X-ray crystallographic screening were utilized in a structure-based campaign to design a limited number of analogues (less than twenty), many of which also showed pan-assay activity against Mtb strain H37Rv. These included the benzo[b][1,4]oxazine derivative (MIC90 6.25 μM), a novel hit compound suitable as a starting point for a more involved hit to lead candidate medicinal chemistry campaign.


Introduction
The pathogenic bacterium Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis (TB), for which over 10 million people globally were diagnosed and treated in 2019 and which was responsible for over 1.4 million deaths [1]. Antibiotic-resistant strains of Mtb constitute a major worldwide healthcare problem, ensuring the continued demand for the development of new anti-TB drugs, particularly those that work via novel mechanisms. The Mtb genome encodes twenty cytochrome P450 enzymes (CYPs), an unusually large number given its relatively small size (4.41 Mb pairs), representing a CYP gene density over 240-fold greater than that of the human genome [2,3]. The preponderance of P450s in the Mtb genome, indicating a heavy reliance by the bacterium (which primarily exploits highly perfused tissues) on this class of monooxygenase enzymes, has resulted in Mtb CYPs being highlighted as viable targets for new anti-TB therapeutics. CYP121A1 (mycocyclosin synthase; EC 1.14.19.70; Rv2276), functioning downstream of the cyclo(L-tyrosyl-L-tyrosyl) (cYY) synthase (EC 2.3.2.21; Rv2275), catalyzes the conversion in Mtb of the cyclic dipeptide cYY into the highly strained biaryl-containing natural product mycocyclosin (Fig. 1) [4–6]. Whilst the precise function of mycocyclosin remains unknown, ablation of CYP121A1 expression via genetic knockouts showed the enzyme to be essential for the viability of the bacterium in vitro; CYP121A1 is thus seen as an important target for anti-TB drug research [7]. Herein the use of a combination of X-ray crystallographic screening against CYP121A1 and phenotypic screening of whole-cell Mtb is described as a novel approach in hit discovery (XP Screen). This approach is used to highlight a series of compounds with molecular properties suitable for further elaboration into more potent anti-TB agents.
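The quoted fold difference in CYP gene density can be checked with a short calculation. The Mtb figures (20 CYPs, 4.41 Mb) are taken from the text; the human figures used here (57 functional CYP genes and a haploid genome of roughly 3100 Mb) are assumptions based on commonly cited values and are not stated in this paper.

```python
# Back-of-envelope check of the CYP gene-density comparison.
MTB_CYP_GENES = 20        # from the text
MTB_GENOME_MB = 4.41      # from the text
HUMAN_CYP_GENES = 57      # assumption: commonly cited count of functional human CYP genes
HUMAN_GENOME_MB = 3100.0  # assumption: approximate haploid human genome size


def genes_per_mb(n_genes: int, genome_mb: float) -> float:
    """Return gene density in genes per megabase."""
    return n_genes / genome_mb


mtb_density = genes_per_mb(MTB_CYP_GENES, MTB_GENOME_MB)
human_density = genes_per_mb(HUMAN_CYP_GENES, HUMAN_GENOME_MB)
fold_difference = mtb_density / human_density

print(f"Mtb:   {mtb_density:.2f} CYP genes per Mb")
print(f"Human: {human_density:.4f} CYP genes per Mb")
print(f"Fold difference: {fold_difference:.0f}")
```

Under these assumed human values the ratio falls between 240 and 250, consistent with the "over 240-fold" figure given above.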
As part of continuing efforts aimed at the discovery of novel inhibitors of CYP121A1 [8–10], the binding mode of a novel class of inhibitors with moderate activity against Mtb strain H37Rv was reinvestigated. Previous work had shown that a series of D-tryptophan derived thiazoles exhibited moderate potency (Kd ~30–55 μM) against CYP121A1 in a UV–Vis spectrophotometric assay [10]. Some members of the series (Fig. 2, compounds 1–3) also showed moderate activities (MIC90 ~40–100 μM) against Mtb in whole-cell assays. Unfortunately, X-ray crystallographic structures of CYP121A1 in complex with compounds of this series could not be generated, which hampered our understanding of their binding modes and the rational design of more potent derivatives. With these factors in mind, a more thorough X-ray crystallographic screen of compounds akin to 1–3 was performed, to discover novel and simplified analogues with binding modes that were well understood and that were suitable starting points for a structure-based design campaign. The aim was also to couple this approach to phenotypic screening against Mtb, in order to ensure that the previously noted in cellulo activity against the bacterium was maintained despite the chemical changes being made to the series. CYP121A1 in complex with a wide range of fragments and inhibitors has been previously reported [8,9]. The CYP121A1 binding pocket has been shown by X-ray crystallography to accommodate ligands in three distinct regions: (i) binding to the heme group, (ii) binding in the cYY region and (iii) binding at the top of the binding pocket. All the compounds shown binding in these regions have been observed using high resolution crystal structures and have been characterised using biophysical techniques such as differential scanning fluorimetry (DSF) and isothermal titration calorimetry (ITC).

Results and discussion
A focused screening set of compounds was synthesized (Fig. 2, compounds 4–37) containing functionalities common to a number of the previous key compounds [10]. The structures were simplified by removing the chiral amino functionality, as previous studies had shown the precise stereochemistry to be unimportant in terms of affinity. In addition, the thiazole ring was replaced with a pyridine (4 and 7) or pyrimidine (10) ring, both in order to improve solubility and to allow the more rapid generation of diverse analogues. Halogens were added at key positions within these compounds (5–6, 8–9 and 11–13) in order to potentially aid crystallographic interpretation of ligand binding modes; a similar concept (FragLites) has been reported recently [11] in the area of fragment-based discovery. The aim was also to reduce compound lipophilicity whilst retaining (as much as possible) overall compound topology. This was achieved by replacing unsaturated aromatic sidechains with achiral saturated ring systems containing solubilizing or other heavier 'X-ray friendly' heteroatoms (such as sulfur) (14–33). A small number of comparable direct aromatic analogues (34–37) were also included.
Compounds 4–9 were prepared by heating the appropriately substituted indole with either 2-vinylpyridine (for 4–6) or 4-vinylpyridine (for 7–9) in acetic acid as described previously [12–15].
Similar reactions [13] with 2-chloro-4-vinylpyrimidine [16] in a mixture of acetic acid and 1,4-dioxane afforded 11–13. Hydrogenation of 11 over palladium on carbon in ethanol in the presence of triethylamine yielded the known pyrimidine 10 [13]. Treatment of 11–13 with the appropriate secondary amine (or amine hydrochloride salt) in hot ethanol (with added anhydrous sodium carbonate when using an amine hydrochloride) gave 14–25 and 27–33 (Scheme 1). Heating tert-butyl carbamate 25 with hydrogen chloride in diethyl ether and methanol afforded the amine hydrochloride 26. Treatment of chloride 11 with 4-trifluoromethoxyphenylboronic acid and catalytic palladium(0) afforded biaryl derivative 34. Chlorination of 34 with N-chlorosuccinimide in hot tetrahydrofuran gave 35. Bromination of 34 with N-bromosuccinimide to prepare 36 was more facile and proceeded at room temperature in dichloromethane. Palladium(0) catalyzed arylation of 11 with 2-pyridylboronic acid N-methyliminodiacetic acid (MIDA) ester according to published conditions [17] afforded the 2-pyridyl derivative 37 (isolated as the hydrochloride salt after purification via formation of the tert-butyl carbamate 38 and subsequent acid catalyzed deprotection).
Compounds 4–37 were screened crystallographically against CYP121A1 using soaking experiments. A number of hits (7, 10, 14, 21, 31 and 33) were identified, although the completeness of the electron density for the ligands was variable. All six of the X-ray crystallographic hits were ligands containing 7 as an integral subunit of molecular structure (Fig. 3). 4-Pyridyl derivative 7 has previously been used successfully [18] as a starting point for a structure-based design campaign against p38α MAP kinase, leading to inhibitors with low nanomolar affinity against the enzyme. It was noted that only certain substituents located at either end of the core structure (7) had proven successful crystallographically. In particular, substitution at C-2 of the pyrimidine ring appeared to favor both the morpholine and the constitutionally isomeric 3-methoxyazetidine groups. Three structures (with 7, 31 and 33) contained either incomplete density into which to place the ligand (indicating high mobility), or density suggestive that the ligand might bind in a variety of possible orientations; these three were therefore discarded as unsuitable starting points from which to generate more potent compounds (although data obtained from them were utilized for guidance in subsequent structure-based design efforts). X-ray crystal structures of compounds 10 and 21 with CYP121A1, whilst of better quality, still suffered from incomplete ligand electron density, either lacking complete density for the indole moiety (for 10) or lacking density for the central linker (for 21) (Fig. S1). In sharp contrast, the X-ray crystallographic structure for 14 included complete contiguous difference density into which the ligand could be fitted clearly and unequivocally (Fig. 4).
Whilst morpholines were shown to bind to the enzyme, a number of morpholine replacements, varying by as little as a single heavy atom (piperazine and thiomorpholine), or with only one additional or deleted heavy atom (N-methylpiperazine and isoxazolidine), were all crystallographically ineffective. Similarly, none of the 2-aryl analogues 34–37 were successful structurally in this screen. This approach thus exploits the inherent subtlety of molecular binding events to triage a range of extremely similar compounds within a moderately sized screening set.
Concurrent with the X-ray crystallographic screen, a limited number of the compounds (15, 23, 30, 32 and 34) were also screened against whole-cell Mtb strain H37Rv in a panel of three different growth media, including one containing cholesterol as the sole carbon source (in an attempt to recapitulate more accurately the in vivo environment). Four of these compounds (15, 23, 32 and 34) showed promising in cellulo activity (MIC90 ~12.5–50 μM) against the bacterium in one or more of the growth media tested (Table 1). The combined data suggested that one might reasonably expect to find whole-cell activity in a variety of close analogues within the series, and that there was a possibility of generating high quality X-ray crystal hits during subsequent rounds of compound optimization.
Compounds 4–37 were screened for their affinity against CYP121A1 using UV–Visible spectroscopy (UV–Vis), which involves monitoring the shift of the Soret band at 416.5 nm. Only the three 4-pyridyl compounds 7–9 showed any discernible effect, with all other derivatives showing no Soret band shift (Table 2). The lack of shift for 14 was expected given the fact that the compound had been shown to bind in a region of the protein that is distal to the heme. Presumably 7–9 interact with the heme iron via the pyridyl nitrogen atom, whilst steric encumbrance of the corresponding 2-pyridyl isomers 4–6 prevents them from binding in a similar fashion. Chloride 8 and bromide 9 show good levels of affinity (Kd: 5.8 μM and 3.8 μM respectively) whereas the core structure 7 showed much lower potency (Kd: 480 μM). The relatively moderate potency for 7 might explain our inability to obtain high quality X-ray structural data for this key compound, given its moderate lipophilicity (cLogP 2.12).
The morpholine-containing compound 14 is a ligand with a relatively low molecular mass and low cLogP (308 and 2.32 respectively) and as such represents a convenient starting point for elaboration. X-ray crystallography revealed that 14 binds to CYP121A1 with a number of interesting key features, chief amongst them being the fact that 14 does not engage with the iron(III) atom of the heme group, either directly or indirectly via a distally bound water molecule above the iron center. The X-ray structure of 14 indicated the presence of a hydrogen bond between the donor sidechain CONH2 of Asn85 and the morpholine oxygen atom, which acts as the acceptor. This hydrogen bond is observed in the cYY-bound structure (PDB accession code: 3G5H) [4], where the acceptor is instead one of the carbonyl oxygen atoms of the cyclic diketopiperazine core of cYY. In addition, cYY binds surrounded by two extensive water clusters [4] (Fig. 5a; water molecules in one of the two clusters are highlighted as red spheres). Overlaying of the structures containing cYY and 14 (Fig. 5b) shows that, upon binding, the indole moiety of 14 disrupts this hydrogen bonded network and ousts a large number of water molecules from this highly solvated pocket. Numerous crystal structures of CYP121A1 have been deposited in the Protein Data Bank (PDB) containing ligands binding in differing locations within the enzyme active site. Morpholine 14 binds to CYP121A1 with a novel binding mode, in which the indolyl moiety is located high up in an otherwise highly solvated pocket that lies distal to the heme group. This novelty, and the fact that this pocket is a specific feature of CYP121A1, which might confer selectivity over other CYPs for compounds that bind at this site, cumulatively made it an extremely attractive start point for further development. Gratifyingly, 14 also showed appreciable in cellulo activity (MIC90 ~12.5–25 μM) against Mtb in all three of the growth media tested, including complete 7H9, a medium highly enriched in Bovine Serum Albumin (BSA). This pan-assay activity was welcomed, as it indicated that derivatives akin to our key X-ray hit might be expected to show relatively low levels of plasma protein binding (PPB), reinforcing that 14 would be a suitable starting point for further elaboration.
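As an illustration of how a Kd can be extracted from a UV–Vis titration of the kind described above, the sketch below fits a simple one-site hyperbolic model, ΔA = ΔAmax·[L]/(Kd + [L]). The choice of model, the ligand concentrations and the ΔAmax value are all assumptions made for illustration (the fitting procedure actually used is not given in this section), and the data are synthetic, generated from the Kd reported for chloride 8 (5.8 μM). Ligand depletion is ignored.

```python
def one_site(conc_um: float, kd_um: float, da_max: float) -> float:
    """One-site hyperbolic binding model: dA = dA_max * [L] / (Kd + [L])."""
    return da_max * conc_um / (kd_um + conc_um)


def fit_kd(concs, signals, log_kd_min=-2.0, log_kd_max=4.0, steps=6000):
    """Grid search over log10(Kd); dA_max is solved by linear least squares
    at each trial Kd. Returns (Kd, dA_max) with the smallest squared error."""
    best = None
    for i in range(steps + 1):
        kd = 10 ** (log_kd_min + (log_kd_max - log_kd_min) * i / steps)
        shape = [c / (kd + c) for c in concs]
        da_max = sum(s * y for s, y in zip(shape, signals)) / sum(s * s for s in shape)
        sse = sum((y - da_max * s) ** 2 for s, y in zip(shape, signals))
        if best is None or sse < best[0]:
            best = (sse, kd, da_max)
    return best[1], best[2]


# Hypothetical titration points (uM) and noise-free synthetic absorbance changes.
concs = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0]
signals = [one_site(c, 5.8, 0.10) for c in concs]

kd_fit, da_max_fit = fit_kd(concs, signals)
print(f"Fitted Kd = {kd_fit:.2f} uM, dA_max = {da_max_fit:.3f}")
```

The grid has a resolution of 0.001 log units, so the recovered Kd is accurate to well under 1%; for tight binders a model that accounts for ligand depletion would be more appropriate.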
In order to more fully explore activity around core structure 7, and to attempt to collect further structural data, a limited number of derivatives were synthesized (41–44). Identically substituted derivatives of the isomeric 2-pyridyl structure 4 were also prepared (39–40) by analogous means. Only the 4-pyridyl derivatives (41–44) showed any activity, with fluoride 41 [19], iodide 42 and trifluoromethyl ether 44 showing the greatest affinities (Kd: 27.3 μM, 3.2 μM and 12.7 μM respectively) whereas methyl ether 43 was less active (Kd: 105 μM), although still significantly more potent than 7. Unfortunately, useable structural data for 39–44 could not be obtained (44 gave a partial structure), and so further work on both the 4-pyridyl and 2-pyridyl compounds was discontinued.
Despite the inability to determine affinity data for 14 against CYP121A1 by UV–Vis, we looked to develop 14 due to the quality of X-ray data and novelty of binding mode coupled with the promising in cellulo activity shown by the compound, and so investigated other biophysical methods to determine the in vitro potency. Isothermal titration calorimetry (ITC) was attempted with 14, and the data compared to those obtained in a similar experiment using a known compound 45 (Fig. 6), for which affinity had been measured previously via this method (Kd: 40 μM) [9]. In these experiments, 14 showed lower affinity (Kd: ~100 μM) than standard 45, which showed a potency (Kd: 20–25 μM) in line with that measured previously (for ITC thermograms, see SI). With 14, exotherms measured by ITC were relatively small, perhaps as a result of energy being required to disrupt the highly ordered network of waters displaced by the indolyl moiety (greater entropic component to the binding event). Recent studies have highlighted that compound binding affinities measured in bioassays can be hugely influenced by the presence or lack of key water molecules in the active site [20]. Comparisons of affinities generated for compounds by UV–Vis and ITC in our laboratories against CYP121A1 have indicated that invariably those affinities from ITC are a factor of 10–50 lower than those measured by UV–Vis, a prime example being ITC assay standard 45 (as well as several other members of this 3-aminopyrazole based series). Additionally, ITC requires larger quantities of protein than UV–Vis, and as such is not always suitable for the generation of large quantities of affinity data from multiple analogous compounds. Nevertheless, the indicative value for the binding affinity of 14 against CYP121A1 from ITC proved useful, as both the X-ray data and Mtb strain H37Rv pan-assay activity for this compound were key to our strategy for compound elaboration.
Table 1 Whole-cell data against Mtb H37Rv. a) c7H9: Middlebrook 7H9 broth (Difco) supplemented with 10% (v/v) of Albumin, Dextrose, and Catalase (ADC) enrichment, 0.05% (v/v) tyloxapol and 0.02% (v/v) glycerol. b) 7H9-Low BSA: Middlebrook 7H9 broth with low BSA supplemented with 10% (v/v) of Albumin, Dextrose, and NaCl enrichment (ADN), 0.05% (v/v) tyloxapol and 0.02% (v/v) glycerol. This medium contains 0.05% instead of 0.5% (w/v) Bovine Albumin (Fraction V). c) MMM-Ch: Mycobacterial Minimal Medium with Cholesterol: 0.5 g/L L-asparagine, 1 g/L KH2PO4, 2.5 g/L Na2HPO4, 50 mg/L ferric ammonium citrate, 0.5 g/L MgSO4·7H2O, 0.5 mg/L CaCl2, 0.1 mg/mL ZnSO4
Table 2 UV–Visible spectrophotometric data. Compounds were screened at a concentration of 300 mM. All compounds listed gave Type II shift in the Soret band.
The X-ray structure for 14 was utilized in a structure-based design campaign with a view to generating analogous derivatives with improved affinity against Mtb. In this hit expansion phase, a number of novel compounds were synthesized, each of which retained all 23 heavy atoms of the hit 14. By retaining these key atoms, one could expect the resulting compounds to adopt a similar conformation upon binding to CYP121A1, with each derivative making comparable key contacts to the enzyme. Based upon the X-ray structure (Fig. 7, Panel A), 14 was elaborated in five key areas, by adding a limited number of heavy atoms to the hit structure ( 5). Additionally, the specific conformation of the morpholine ring upon engagement with the sidechain of Asn85 (Fig. 7, Panel B) was factored into the design strategy.
Compounds 51–65 were prepared using methods similar to those described above (for 11–33). In view of the uncertainty of exactly how large a substituent at C-5 of the indole ring could be tolerated, a number of derivatives bearing a variety of functionalities of differing sizes were prepared. Reactions between 2-chloro-4-vinylpyrimidine [16] and the appropriately 5-substituted-1H-indole gave the fluoro, methoxy, trifluoromethoxy and benzyloxy intermediates (46–49 respectively) which were reacted with morpholine in hot ethanol to afford 51–54. Hydrogenation of 54 over palladium on activated carbon afforded the hydroxy derivative 55. Reduction of bromide 21 under an atmosphere of deuterium with added base allowed for the preparation of the specifically deuterated derivative 56 (~95% d incorporation). Substitution at N-1 on the indole ring with a methyl group was achieved by similar means affording 57 (via 50). Direct derivatization at the indole nitrogen of 14 required more forceful conditions; acetyl 64 (refluxing acetic anhydride) and methanesulphonyl 65 (MsCl with concentrated base under phase transfer conditions) [21] derivatives were prepared thus. Bridged bicyclic derivatives 58 and 59 (aimed at locking in the morpholine conformation noted above) and the ring expanded homologue 60 were prepared from the appropriate 2-chloropyrimidine (11 and 46) and the corresponding secondary amine (or amine hydrochloride salt) in hot ethanol. Benzo[b][1,4]oxazines 61–63 were prepared from chlorides 11 and 46 by similar means, but required more vigorous conditions (120 °C in a sealed tube or 160 °C with microwave irradiation) due to the much reduced nucleophilicities of the aromatic amines.
Compounds 51–65 were screened crystallographically against CYP121A1, but unfortunately in each case no ligand density was observed, although it was seen that the lattice of resting state waters had been partially or completely disrupted. Compounds 51–65 were screened for in vitro activity against CYP121A1 using UV–Vis spectroscopy. As with the initial X-ray hit 14, none of these direct analogues showed any discernible effect (no Soret band shift), suggesting that, like 14, they bind to CYP121A1 without direct contact to the heme. Likewise, a number of the compounds were examined by ITC, but it proved difficult to generate affinity data for these molecules using ITC due to inadequate solubility in aqueous buffers. Despite this, a number of analogues of 14 were screened against Mtb (Table 1). Several derivatives (31, 51, 53, 57, 60, 61 and 63) were moderate inhibitors, and like 14, showed activity in all three of the growth media tested. Bridged bicyclic derivative 58 was active, but showed much reduced potency in BSA rich medium. N-Acetyl derivative 64 had essentially lost all activity, whereas the corresponding N-methyl derivative 57 was one of the compounds that exhibited pan-assay activity. It was noted that the most potent derivative against Mtb was 61 (MIC90 6.25 μM), a compound that had been designed based upon the overlap of X-ray crystal structures for 14 and 31, where the ligand in 31 had been positioned in an alternative orientation, with the ligand rotated 180° in the active site, so suggesting the idea of appending an aromatic ring onto the morpholine moiety.
The difficulty of obtaining X-ray crystal structures for many of these in cellulo active compounds is perhaps not too surprising. Despite there being numerous CYP121A1 crystal structures deposited in the PDB, the majority are with ligands that either have low cLogP or make direct contact with the iron atom of the heme group. As the substrate for the enzyme (cYY) has a cLogP of 0.82, the enzyme is clearly designed to accommodate substrate and product that are highly hydrophilic in nature. Recent studies recording imidazolyl- and triazolyl-derived pyrazoles as inhibitors of CYP121A1 have shown that, whilst compounds with higher lipophilicity (cLogP > 4) showed better activity in cellulo, only some of those with much lower cLogP (1.44 and 2.68) afforded useable crystal structures [22]. These opposing factors clearly constitute a significant challenge in this area, as compounds that deliver the greatest structural knowledge struggle to permeate the highly lipophilic mycolic acid derived cell wall of Mtb and so invariably show much lower in cellulo potencies.
In order to obtain a greater degree of structural confidence for this in cellulo active series of compounds, derivatives akin to 14 were prepared that might still be able to adopt similar binding modes but had significantly lower cLogP values. This was achieved by direct replacement of the pyrimidine core of 14 with either a suitably reduced pyrrolo[3,4-d]pyrimidine or pyrido[3,4-d]pyrimidine. This scaffold morphing approach entailed the relocation of a pyrimidine nitrogen to the alternative ring position where it was suitably positioned to be able to interact with the sidechain hydroxyl group of Thr77 of the protein via an additional hydrogen bond, thereby potentially gaining additional affinity (Fig. 7, Panel A). This nitrogen relocation had the added benefit of allowing the addition of another hydrophilic ring onto the structure, so greatly reducing the cLogP values of the resulting fused bicyclic compounds (to between 1.22 and 2.22).
Chlorides 66–69 were prepared from the corresponding dichlorides as described previously [23,24]. Palladium(0) catalyzed vinylation was extremely facile, and gave 70–73 in high yield under mild conditions. Unfortunately, vinyl derivatives 70–73 failed to afford addition products when treated with indoles under conditions identical to those used for the synthesis of 14 and multiple close analogues; instead the alternative vinyl boronate 74 was utilized as a coupling partner. Vinyl boronate 74 is only very briefly described [25] within the patent literature as an (E/Z)-mixture. When prepared via an alternative route over five steps from indole using known chemistry (iodination, N-Boc protection, TMS-acetylene addition, fluoride induced TMS removal [26] and borane·THF catalyzed addition of pinacolborane in hot THF [27]), 74 was obtained as the (E)-isomer (olefinic coupling constant: J = 19 Hz). The added length of synthesis required to generate 74 resulted in only the three derivatives 81–83 with lowest cLogP values (1.22–1.77) being targeted. Thus chlorides 66–68 were reacted with 74 (under identical conditions to those used to prepare 70–73) to afford 75–77 as the (E)-isomers (olefinic coupling constants: J = 16–17 Hz). Reduction of 75–77 with hydrogen over palladium on carbon in ethyl acetate yielded 78–80 (after chromatography on silica to remove over-reduced by-products) which, upon deprotection under acidic conditions, gave 81–83 (as the hydrochloride salts) (Scheme 2).
Fused bicyclic compounds 81–83 behaved, as expected, like all of the pyrimidine derivatives in this series in that we were unable to determine the affinities for these compounds against CYP121A1 by UV–Vis spectroscopy. This was reassuring, as it effectively ruled out the possibility of a change of binding mode in which interaction with the heme iron atom might conceivably occur via the cyclic secondary amine functionalities. Likewise, determination of potency by ITC proved difficult for these derivatives (in common with a number of the pyrimidines described above). Despite a significant reduction in cLogP values, it again proved impossible to collect satisfactory X-ray structures for 81–83. Nevertheless, the much improved solubility of 81–83 in aqueous media allowed the use of differential scanning fluorimetry (DSF) to probe the binding affinities of 81–83 (for protein melting curves, see SI). Addition of all three bicyclic derivatives to solutions of CYP121A1 in aqueous buffer (1 mM ligand with 5 mM protein) resulted in increases to the melting temperature (Tm) of the protein (Table 3). This suggests that 81–83 bound to, and stabilized, the protein. The effect was most pronounced in the two pyrrolo[3,4-d]pyrimidine derived compounds 81 (+3.5 °C) and 82 (+2.4 °C), with the ring expanded homologue 83 (+1.0 °C) showing a lesser effect (akin to that of aminopyrimidine 45 used as positive control). The original screening set, together with the set of advanced derivatives (as outlined above), were also subsequently screened by these means. Most compounds gave inconclusive results as a result of aggregation, due to limited ligand solubility in the aqueous buffer. Compounds with lower cLogP proved to be less problematic, with both 10 (an X-ray hit) and 55 (the 5-hydroxy derivative of the key X-ray hit 14) also showing evidence of activity (protein melting temperature shifts of +3.5 °C and +0.5 °C respectively).
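The melting temperature shifts quoted above are differences between the Tm of the protein with and without ligand. The sketch below illustrates one common way of reading a Tm from a DSF curve, by locating the steepest rise of the fluorescence signal. It is a minimal illustration on synthetic data and not the analysis used in this work: the two-state Boltzmann curve shape, the apo Tm of 50 °C and the slope factor are hypothetical, and only the +3.5 °C shift is taken from the text (compound 81).

```python
import math


def boltzmann(temp_c: float, tm_c: float, slope: float = 1.5) -> float:
    """Normalized two-state unfolding curve (0 = folded, 1 = unfolded)."""
    return 1.0 / (1.0 + math.exp((tm_c - temp_c) / slope))


def estimate_tm(temps, fluorescence) -> float:
    """Estimate Tm as the midpoint of the interval with the steepest rise."""
    steepest = max(
        range(len(temps) - 1),
        key=lambda i: (fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i]),
    )
    return 0.5 * (temps[steepest] + temps[steepest + 1])


# Synthetic melt curves sampled every 0.1 degC from 25 to 85 degC.
temps = [25.0 + 0.1 * i for i in range(601)]
APO_TM = 50.0        # hypothetical Tm of ligand-free protein
LIGAND_SHIFT = 3.5   # shift reported in the text for compound 81

apo_curve = [boltzmann(t, APO_TM) for t in temps]
bound_curve = [boltzmann(t, APO_TM + LIGAND_SHIFT) for t in temps]

delta_tm = estimate_tm(temps, bound_curve) - estimate_tm(temps, apo_curve)
print(f"Estimated delta Tm = {delta_tm:.1f} degC")
```

With real data the curve would normally be smoothed or fitted before differentiation, since the first derivative amplifies noise; the estimate here is only resolved to the 0.1 °C sampling interval.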
A panel of key compounds (7–10, 14, 21, 31, 33, 41–44, 51, 53, 55, 57, 60, 61, 63, 81–83) were tested in an LC-MS based activity assay, to determine their ability to inhibit the turnover of cYY into mycocyclosin relative to clotrimazole, a known azole derived inhibitor of CYP121A1. All compounds tested were either X-ray crystallographic hits, or had yielded encouraging data in UV–Visible spectroscopy, differential scanning fluorimetry, isothermal titration calorimetry or antimycobacterial activity assays (as described above). Under the conditions used, no CYP121A1 mediated turnover of the experimental compounds was observed, indicating that none of the compounds were substrates for CYP121A1. Gratifyingly, in the presence of compounds (7–10, 14, 21, 31, 33, 41–44, 51, 53, 55, 57, 60, 61, 63, 81–83), a general trend towards inhibition of CYP121A1 was noted, thus confirming inhibitory activity against CYP121A1 as a general property of a number of key derivatives within the series of compounds. In particular, 14 (the original hit from the XP screen) and 61 (with pan-assay activity against whole cell Mtb) both markedly reduced the production of mycocyclosin (Fig. 8, Panel B), thus reinforcing the suitability of 61 as a starting point for further development. Given their respective positions within the medicinal chemistry campaign, the propensity of compounds 14 and 61 to occlude the active site, and therefore prevent cYY from binding, was quantified using a UV–Visible spectrophotometric competition assay (Fig. S5). In the competition assays, compounds 14 and 61 were introduced prior to a titration with cYY; in the reverse competition assays, cYY was introduced prior to a titration with either 14 or 61. A control titration with cYY produced a Kd of 17.2 ± 0.6 μM, a value akin to that reported previously [4].
An increase in the Kd value for cYY in the presence of 14 or 61 would indicate that the compounds were occluding the active site of CYP121A1 to a significant enough degree to preclude the simultaneous binding of cYY. Indeed, the presence of 14 appeared to perturb the binding of cYY, resulting in a Kd value of 25.2 ± 1.0 μM, a 1.5-fold increase over the cYY control (Table 4). The presence of 61, however, did not appear to greatly perturb the binding of cYY, an observation that is seemingly inconsistent with the LC-MS activity and antimycobacterial assays, but might be indicative of a more limited solubility of 61 (relative to 14) under the assay conditions.
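The 1.5-fold figure can be reproduced, together with an approximate uncertainty, from the two Kd values given above. The sketch assumes that the two quoted errors are independent standard errors and uses first-order (relative error in quadrature) propagation; neither assumption is stated in the text.

```python
import math


def ratio_with_error(num: float, num_err: float, den: float, den_err: float):
    """Return (ratio, uncertainty) for num/den, combining relative errors in quadrature."""
    ratio = num / den
    rel_err = math.sqrt((num_err / num) ** 2 + (den_err / den) ** 2)
    return ratio, ratio * rel_err


# Kd of cYY in the presence of 14, and Kd of the cYY control (values from the text).
fold, fold_err = ratio_with_error(25.2, 1.0, 17.2, 0.6)
print(f"Fold increase in apparent Kd: {fold:.2f} +/- {fold_err:.2f}")
```

On these assumptions the interval around the ratio excludes 1, which is consistent with the statement that 14 perturbs cYY binding.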

Conclusions
X-ray crystallographic screening is a frequently used technique for hit generation in the field of fragment-based drug discovery [11,28] with a noticeable trend towards the use of the technique to screen increasingly smaller compounds [29]. Recently, X-ray crystallography has been successfully explored in the absence of crystal cryo-protection [30] in order to further aid screening throughput. Such techniques, however, are unlikely to yield hit compounds showing appreciable levels of cellular activity, due to their small size. As such, a purely target-based screening approach suffers in that it is possible to undertake a substantial amount of discovery research with only a relatively limited knowledge of in cellulo or in vivo activity, with compounds being progressed that lack acceptable cell permeability or have poor ADMET properties.
Recent examples of a more typical phenotypic based screening approach against Mtb have involved open-source initiatives utilizing the diverse array of compounds residing within corporate collections to provide lead compounds. These efforts have identified a series of 2-thiophenyl morpholines [31] (from the Eli Lilly collection), novel spirocyclic derivatives [32] and pyrrolothiadiazoles [33] (both from the GSK collection). In addition, spirocycles containing 3-substituted indoles [34] (that possess high membrane permeability) have also recently been described as having excellent activity against Mtb. These approaches invariably require a more traditional medicinal chemistry approach to compound progression, from larger and more chemically advanced compounds, via the generation of multiple rounds of structure-activity relationship (SAR) data, often with very limited knowledge of precise target engagement. Novel approaches towards new anti-TB agents in our laboratories have focused heavily on structure-based design against new targets, a recent case in point [35,36] being Mtb fumarate hydratase (fumarase) in which inhibitors with a dimeric binding mode at an allosteric site were utilized to affect Mtb viability. In the current paper we describe a novel screening approach (XP screen) that utilizes a combination of X-ray crystallographic screening of a focused compound set against a chosen target (illustrated here with CYP121A1) and phenotypic screening to ascertain in cellulo compound activity. This bidirectional approach, which addresses the problem of finding novel chemical hits from both ends of the traditional screening cascade at a very early stage, is an effective method by which to rapidly triage those compounds that are not observed crystallographically (those for which target engagement is unassured) together with those that possess only very limited in cellulo activity.
Table 3 Differential scanning fluorimetry (DSF) data. Right-hand column shows the increases in protein melting temperatures. Concentrations: CYP121A1 (5 mM), compounds (1 mM).
In summary, we have used an XP screen against CYP121A1, in combination with a variety of biophysical techniques, to identify novel in cellulo active inhibitors of this key target [37] in anti-TB research. Structure-based design around a key X-ray crystallographic hit 14 led to a number of close structural analogues that showed pan-assay activity against Mtb. One of these, the benzo[b][1,4]oxazine derivative 61, is a novel lead compound of moderate molecular mass (Mr = 356) suitable as a starting point for a more thorough lead-to-clinical candidate phase. It inhibits the CYP121A1-mediated turnover of cYY to mycocyclosin and shows favorable activity against Mtb strain H37Rv (MIC90 ~6.25 μM, 2.2 μg/mL) in comparison with a number of azole antifungal drugs (clotrimazole, econazole, miconazole: MIC90 11, 8, and 8 μg/mL respectively) [7,38] that also act via inhibition of CYP121A1 [7], but that suffer from poor oral bioavailability or show unacceptable toxicity profiles.

Chemistry
Petroleum ether (b.p. 40–60 °C), ethyl acetate, n-hexane, dichloromethane and toluene were distilled prior to use. N-Bromosuccinimide was recrystallized from water and dried over potassium hydroxide pellets under reduced pressure in a vacuum desiccator prior to use. Column chromatography was performed on a Biotage Isolera™ Spektra One automated flash purification system using appropriately sized pre-packed silica cartridges. Nuclear magnetic resonance (NMR) spectra were recorded in DMSO-d6 on a Bruker AVANCE III HD 400 spectrometer with a BBO SmartProbe™ operating at 399.6 MHz or 400.1 MHz (for 1H), at 100.5 or 100.6 MHz (for 1H-decoupled 13C) and at 376.0 MHz (for 1H-decoupled 19F), or on a Bruker AVANCE NEO 400 spectrometer with a Prodigy BBO CryoProbe™ operating at 128.4 MHz (for 11B) and at [...]
[Figure caption] Large amounts of cYY and low amounts of mycocyclosin indicate that a compound efficiently inhibits the activity of purified CYP121A1 under the conditions used. Neg: negative control; Pos: positive control; *: X-ray crystallographic hits; ¤: UV–Visible spectroscopy hits; x: antimycobacterial activity assay hits; ǂ: differential scanning fluorimetry hits; Clo: clotrimazole. A total of one experiment was performed. The negative control consisted of a sample without the introduction of reducing power. The positive control consisted of a sample without the introduction of any inhibitory/experimental compounds.
[...] (14). A mixture of 3-(2-(2-chloropyrimidin-4-yl)ethyl)-1H-indole 11 (103 mg, 0.4 mmol) and morpholine (1 mL) in ethanol (6 mL) was stirred and held at reflux for 4 h and allowed to cool to room temperature. The solvent was removed in vacuo and the residues partitioned between dichloromethane and water. The organic layer was separated, the solvent removed in vacuo and the residues subjected to column chromatography on silica. Elution with 0–100% ethyl acetate in petroleum ether (b.p. 40–60 °C) afforded 4-(4-(2-(1H-indol-3-yl)ethyl)pyrimidin-2-yl)morpholine 14 [...]
4-(4-(2-(1H-Indol-3-yl)ethyl)pyrimidin-2-yl)thiomorpholine (15). [...]
4- [...] (129 mg, 0.5 mmol) and N-methylpiperazine (1 mL) in ethanol (6 mL) was stirred and held at reflux for 4 h and allowed to cool to room temperature. The solvent was removed in vacuo and the residues partitioned between dichloromethane and water. The organic layer was separated, the solvent removed in vacuo and the residues triturated with a mixture of n-hexane and dichloromethane. The solids were collected by suction filtration to afford 3- [...]
[...] (20). Prepared as per the method for 16 from 5-chloro-3- [...]
[...] anhydrous sodium carbonate (212 mg, 2.0 mmol) in water (0.5 mL), ethanol (1 mL) and toluene (2 mL) was degassed with nitrogen for 5 min and stirred and held at 90 °C in a sealed tube for 6 h and allowed to cool to room temperature. The mixture was diluted with ethyl acetate and water, the organic layer was separated, the solvent removed in vacuo and the residues subjected to column chromatography on silica. Elution with 0–50% ethyl acetate in petroleum ether (b.p. 40–60 °C) afforded 3- [...]

X-ray crystallography
Untagged CYP121A1 was expressed and purified to homogeneity as described above with an additional final polishing step utilizing a HiLoad 16/600 Superdex 75 preparative grade column (GE Healthcare, UK). The column was equilibrated with 10 mM Tris, 100 mM KCl, pH 7.8 at 4 °C (adjusted with KOH/HCl) and CYP121A1 was eluted isocratically according to the manufacturer's instructions. Fractions with a calculated Reinheitszahl (Rz) ratio of ~2 were pooled and concentrated using a Vivaspin 20 10,000 MWCO centrifugal concentrator (Sartorius, UK). A modified form of the Rz ratio can be derived as follows:

\[ \mathrm{Rz} = \frac{\text{Absorbance maximum at} \sim 416\ \text{nm}}{\text{Absorbance maximum at } 280\ \text{nm}} \]
The crystallization conditions identified previously [39] were used as a starting point for crystallization trials. Optimization screens were set up with a Dragonfly Crystal (SPT Labtech, UK) using 3.5 M (NH4)2SO4 (Molecular Dimensions, UK) as the precipitant and 1 M MES pH 5–6.5 (Molecular Dimensions, UK) as the buffer system in three-lens microplates (SwissSci, Switzerland). Precipitant concentration and pH ranged between 1.5 and 2.5 M and between 5 and 6.5, respectively. Sitting drops were set up with a Mosquito Crystal (SPT Labtech, UK) using 20 mg/mL CYP121A1 in 1 μL drops with a 1:1 ratio of protein to mother liquor. Plates were incubated at 4 °C and crystallogenesis took between 3 and 7 days to occur. Mature crystals were approximately 1 mm in length with an arrowhead morphology. Crystals were either soaked with saturated compound solutions made up in fresh DMSO or solid compound was added directly to the drops. In both cases, 1 μL of mother liquor was removed from the reservoir and transferred to the drop containing crystals to prevent desiccation.
During soaks with DMSO-solubilized compound, 1 μL of compound solution was introduced to the reservoir and thoroughly mixed before 0.5 μL was transferred to the drop, mixed and removed. The transfer process was performed a total of three times. Soaked crystals were periodically removed (up to ~6 months), preserved in Parabar 10312 (Hampton Research, US) and flash-cooled in liquid nitrogen. Crystals were irradiated at the Diamond Light Source (Didcot, UK) on the i03, i04 or i04-1 beamlines using standard collection parameters. Data were automatically indexed, integrated, scaled and merged using the xia2 dials [40] and xia2 3dii pipelines [41,42]. Data analysis was performed with Xtriage [43], molecular replacement with Phaser-MR [44] (using PDB structure 1N40 [39] as the starting model) and initial refinement with Phenix refine [45]. Ligand restraints were generated with AceDRG [46]. Feature-enhanced maps [47] and POLDER maps [48] were calculated to aid in model building, which was performed using WinCoot [49]. Figures for publication were produced using PyMOL 1.3 (Schrödinger Inc). For data collection and refinement statistics, see SI.
Accession codes and atomic coordinates for the X-ray structures of complexes of CYP121A1 with compounds 10 (7NQM), 14 (7NQN) and 21 (7NQO) have been deposited with the RCSB Protein Data Bank (www.rcsb.org) and will be released upon publication.

UV–visible spectroscopy
Interactions of compounds with untagged CYP121A1 were analyzed by UV–Visible spectroscopy. Protein fractions with an Rz ratio of ~2 were used for titrations. Compound stock solutions were made up to 30 mM in fresh DMSO. Titrations with clear colorless compound solutions were performed at 28 °C using either a single-beam Cary 60 UV–Visible spectrophotometer (Agilent, UK) or a dual-beam Cary 300 Bio UV–Visible spectrophotometer (Agilent, UK) recording between 240 and 800 nm. Temperature control was achieved using a Cary single or dual cell Peltier accessory (Agilent, UK) and a Julabo AWC100 recirculating cooling bath (Fisher Scientific, UK). Titrations were performed in 1 cm path length quartz cuvettes (Starna Scientific, UK) using a matched pair when necessary. A solution of CYP121A1 (4–5 μM, Soret absorbance of 0.4–0.5 AU) was used per titration in sterile-filtered 100 mM HEPES, 100 mM KCl, 0.005% Tween-20, pH 7.8 at 28 °C (adjusted with KOH/HCl) to a final volume of 1 mL. Final DMSO concentrations were kept below 1% v/v. Prior to each titration, a baseline correction and zero was performed with buffer alone. Following this, CYP121A1 was added and the cuvette was incubated for 5 min to allow for temperature equilibration. A compound-free absorbance spectrum was recorded and then small volumes of compound (typically 0.05–0.5 μL) were introduced using a Hamilton syringe fitted with a syringe guide (Hamilton, USA). Following each compound addition an absorbance spectrum was recorded and this was repeated until no further spectral changes were observed. The data were baseline corrected and the difference plot (ΔAbs against wavelength) was generated by subtracting the compound-free absorbance spectrum from the spectra collected after each addition of compound. From this plot, the absorbance maximum (Apeak) and minimum (Atrough) were identified.
Changes in absorbance due to compounds (ΔΔAbs) were calculated by subtracting the Atrough value from the Apeak value for each spectrum recorded. Values of ΔΔAbs were plotted against compound concentration (in μM) and the data were fitted using the Michaelis-Menten equation (see below) to derive the Kd values (compound concentration at which ΔΔAbs = ½ Amax). Analysis was performed in Microsoft Excel 2010 (Microsoft, USA) and OriginPro 9.1 (OriginLab, USA). Figures were generated using OriginPro 9.1 and CorelDRAW X7 (Corel, Canada).
\[ \Delta\Delta\mathrm{Abs} = \frac{A_{\max}\, C}{K_d + C} \]
In the Michaelis-Menten equation, ΔΔAbs is the observed change in absorbance for each compound addition (in AU), Amax is the change in heme absorbance at apparent compound saturation (in AU), C is the compound concentration (in μM) and Kd is the dissociation constant (in μM).
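As an illustration of the fitting step described above, the following minimal sketch fits the hyperbolic form ΔΔAbs = Amax·C/(Kd + C) by least squares using only the Python standard library. It is not the authors' procedure (they used Excel and OriginPro); the function names, the search range for Kd and the grid size are all assumptions made for this example.

```python
import math


def hyperbola(c, a_max, kd):
    """Hyperbolic binding curve: ddAbs = a_max * c / (kd + c)."""
    return a_max * c / (kd + c)


def fit_hyperbola(conc, ddabs, log_kd_range=(-3.0, 3.0), grid=601):
    """Least-squares fit of the hyperbola; returns (kd, a_max).

    For any fixed kd the best a_max has a closed form (linear least
    squares through the origin), so only kd is searched: a coarse grid
    on log10(kd), then a golden-section refinement around the best
    grid point. The search range is an assumption of this sketch.
    """
    def best_amax(kd):
        x = [c / (kd + c) for c in conc]
        return sum(xi * yi for xi, yi in zip(x, ddabs)) / sum(xi * xi for xi in x)

    def sse(log_kd):
        kd = 10.0 ** log_kd
        a = best_amax(kd)
        return sum((y - hyperbola(c, a, kd)) ** 2 for c, y in zip(conc, ddabs))

    lo, hi = log_kd_range
    step = (hi - lo) / (grid - 1)
    best_i = min(range(grid), key=lambda i: sse(lo + i * step))
    a = lo + max(best_i - 1, 0) * step
    b = lo + min(best_i + 1, grid - 1) * step

    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):
        c1 = b - phi * (b - a)
        c2 = a + phi * (b - a)
        if sse(c1) < sse(c2):
            b = c2
        else:
            a = c1
    kd = 10.0 ** ((a + b) / 2.0)
    return kd, best_amax(kd)
```

Concentrations and Kd share whatever unit is passed in (μM in the text above). With noisy data the error surface may have more than one local minimum, so the result is worth checking against a plot of the fitted curve.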

Isothermal titration calorimetry
ITC experiments to measure binding affinity of ligands to Mtb CYP121A1 were performed on a MicroCal Auto-iTC200 system (Malvern Instruments, UK) at 25 °C. Ligands were initially prepared as 25 mM stock solutions in DMSO-d6 (and further diluted into DMSO-d6 when necessary). Both ligands and CYP121A1 were diluted into identical buffer (50 mM Tris-HCl, 1 mM EDTA, pH 7.2) to generate mixtures containing 10% (v/v) DMSO-d6, a final ligand concentration of either 0.5 or 2.5 mM (depending on ligand solubility) and a CYP121A1 concentration of 50 μM. Ligand titrations consisted of a small (0.2 μL) initial injection (that was discarded during data processing) followed by nineteen further injections (each of 2 μL) at 120 s intervals. Control titrations were also performed (adding ligand into buffer in the absence of protein) to measure any heats of dilution or buffer mismatch and were subtracted from ligand titrations during data processing. Titration isotherms were integrated to afford the enthalpy change of each injection and were plotted against the molar ratio of added ligand. Titrations were fitted using a one-site binding model using Origin Analysis Software by setting the stoichiometry (N) to one (for weak binding compounds) or allowing stoichiometry to vary (for more potent compounds).
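To give a feel for the molar ratios spanned by this injection scheme, the sketch below computes the cumulative ligand:protein ratio after each injection. It assumes a nominal sample cell volume of 200 μL, which is not stated in the text, takes the protein concentration as 50 μM, and ignores the displacement of cell contents that the instrument software corrects for. The function name and parameters are hypothetical.

```python
def itc_molar_ratios(ligand_mM, protein_uM=50.0, first_uL=0.2,
                     inj_uL=2.0, n_inj=19, cell_uL=200.0):
    """Cumulative ligand:protein molar ratio after each injection.

    cell_uL is an assumed nominal cell volume; no displacement
    correction is applied, so values are approximate.
    """
    protein_nmol = protein_uM * cell_uL / 1000.0  # uM * uL = pmol, /1000 gives nmol
    ratios = []
    injected_uL = 0.0
    for vol in [first_uL] + [inj_uL] * n_inj:
        injected_uL += vol
        ligand_nmol = ligand_mM * injected_uL  # mM * uL = nmol
        ratios.append(ligand_nmol / protein_nmol)
    return ratios
```

Under these assumptions the 2.5 mM ligand solution ends the titration at roughly a ten-fold excess over protein, while the 0.5 mM solution ends at roughly a two-fold excess.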

Differential scanning fluorimetry
DSF was performed using a Bio-Rad CFX Connect system (Bio-Rad, UK), scanning from 25 °C to 95 °C in 0.5 °C increments each of 30 s duration. Samples were run in 96-well plates, with each well containing a final volume of 25 μL. Screening was conducted in 100 mM potassium phosphate pH 6.9, 2.5× Sypro Orange, and with 5 μM Mtb CYP121A1 containing either 4% (v/v) DMSO-d6 or 4% (v/v) of 25 mM stock solutions of ligands in DMSO-d6 (final ligand concentration 1 mM). Experiments for samples giving a positive shift in protein melting temperature were repeated in triplicate under identical conditions and the collected data were averaged (over the four runs).
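The text does not state how melting temperatures were extracted from the melt curves. One common approach, shown here purely as an assumption, takes Tm as the temperature of the steepest rise in fluorescence and reports the shift as the mean ligand Tm over the repeat runs minus the DMSO control Tm. The function names are hypothetical.

```python
def scan_temperatures(start=25.0, stop=95.0, step=0.5):
    """Temperatures visited by the scan described above (25 to 95 in 0.5 steps)."""
    n = int(round((stop - start) / step))
    return [start + step * i for i in range(n + 1)]


def melting_temperature(temps, fluorescence):
    """Estimate Tm as the midpoint of the interval with the steepest rise.

    This first-derivative estimate is an assumed analysis method,
    not one taken from the source.
    """
    best_slope, tm = float("-inf"), None
    for i in range(len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            tm = (temps[i] + temps[i + 1]) / 2.0
    return tm


def delta_tm(ligand_tms, control_tm):
    """Mean ligand Tm over the repeat runs minus the DMSO control Tm."""
    return sum(ligand_tms) / len(ligand_tms) - control_tm
```

Real melt curves are noisy, so in practice the fluorescence trace would usually be smoothed, or fitted to a Boltzmann sigmoid, before taking the derivative.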

Liquid chromatography–mass spectrometry (LC–MS) activity assay
Untagged CYP121A1 was expressed and purified to homogeneity as previously reported [7,38]. Escherichia coli flavodoxin NADP+ oxidoreductase (FLDR) was expressed and purified as previously reported [50,51]. Spinacia oleracea ferredoxin (FDX), Leuconostoc mesenteroides glucose-6-phosphate dehydrogenase (G6PDH), glucose-6-phosphate (G6P) and NADPH were purchased from Sigma-Aldrich (UK). Reaction mixtures comprising 5 μM CYP121A1, 4.6 μM FLDR, 10 μM FDX, 2 units of G6PDH, 10 mM G6P and 2 mM NADPH were run in 50 mM Tris-base, 150 mM KCl, pH 7.6 at 28 °C in a final volume of 500 μL. Reactions were run in amber, silanized glass vials (Agilent, UK). Positive and negative controls were prepared containing either 100 μM cYY (Ambinter, France) or with none respectively. For experimental samples, either 100 μM of compound or 100 μM compound and 100 μM cYY were used. The amount of DMSO was normalized across all samples to 1.6% v/v. For the positive control and experimental samples, compound and/or cYY was introduced prior to CYP121A1 to allow for proper equilibration. Reactions were started with the introduction of a mastermix comprising G6PDH, G6P and NADPH. Samples were incubated at 28 °C and 220 rpm for 2 h before reactions were stopped by the addition of 1 mL DCM via glass pipette. Samples were then briefly vortexed and centrifuged for 10 min (466 × g), and the resulting lower organic phase was extracted using a glass pipette. This process was repeated once more, and the organic phases were pooled and dried overnight in a fumehood. Following this, samples were further dried for 20 min in an EZ-2 centrifugal evaporator (Genevac, UK) set to aqueous mode with the lamp off.
Samples were resuspended in 150 μL ACN (Sigma-Aldrich, UK) supplemented with 0.1% formic acid (Sigma-Aldrich, UK). LC–MS analysis was performed on an Agilent 1290 UHPLC system coupled to an Agilent 6545XT LC-QTOF controlled by MassHunter 10 (Agilent, UK). Columns used were either an EclipsePlus C18 RRHD 1.8 μm, 2.1 mm × 150 mm (Agilent, UK) or a BonusRP RRHD 1.8 μm, 2.1 mm × 50 mm (Agilent, UK) eluting at 0.45 mL/min at 60 °C. Mobile phases were either water supplemented with 0.1% formic acid (A) or acetonitrile supplemented with 0.1% formic acid (B). A gradient of 3–40% B over 6 min was used for separation. Signal acquisition was achieved using MS1 mode scanning from 100 to 3000 Da at 4 Hz. Data were analyzed on MassHunter Qualitative 10 and Quantitative 10 (Agilent, UK).
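When inspecting such MS1 data, it helps to know where substrate and product are expected. The sketch below computes monoisotopic masses and [M+H]+ values for cYY and mycocyclosin. The molecular formulae (C18H18N2O4 and C18H16N2O4) are inferred from the structures rather than quoted from the text, and detection as the protonated ion is an assumption, since the ionization mode is not stated.

```python
# Monoisotopic masses (u), rounded to six decimals.
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074, "O": 15.994915}
PROTON = 1.007276  # mass of H+ (u)

# Assumed formulae: cyclo(Tyr-Tyr), and its product after loss of two H
# on formation of the biaryl C-C bond.
CYY = {"C": 18, "H": 18, "N": 2, "O": 4}
MYCOCYCLOSIN = {"C": 18, "H": 16, "N": 2, "O": 4}


def monoisotopic_mass(formula):
    """Sum of monoisotopic element masses for a {element: count} formula."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def mz_protonated(formula):
    """m/z of the singly protonated ion [M+H]+ (assumed ion type)."""
    return monoisotopic_mass(formula) + PROTON
```

The two species differ by the mass of two hydrogen atoms (about 2.016 u), so they are readily distinguished at QTOF resolution.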

UV–visible spectrophotometric competition assay
Compound stock solutions of both 14 and 61 at 100 mM and a 30 mM cYY stock solution were prepared in fresh DMSO for competition assays. A titration with cYY alone was performed as a control using a stock solution at 15 mM prepared in fresh DMSO. Assays were performed on a dual-beam Cary 300 UV–Visible spectrophotometer (Agilent, UK). Following introduction of 5 μM CYP121A1, 100 μM of either compound 14 or 61 was introduced in two equal additions and spectra recorded. The second spectrum taken was used as the starting point for a titration with cYY. Reverse competition assays were performed using a spectrum with 100 μM cYY present as the starting point for a titration with compound 14 or 61. A 100 mM cYY stock solution and 30 mM compound stock solutions were prepared in fresh DMSO for the reverse competition assay. Final DMSO concentrations were kept below 1% v/v. Titrations were performed and data processed as previously described. Data were fitted to the Hill equation to derive the Kd values (compound concentration at which ΔΔAbs = ½ Amax), as shown below:
\[ \Delta\Delta\mathrm{Abs} = \frac{A_{\max}\, C^{n}}{K_d^{\,n} + C^{n}} \]
In the Hill equation, ΔΔAbs refers to the observed absorption difference at each compound addition (in AU), Amax is the change in heme absorbance at apparent compound saturation (in AU), C is the compound concentration used (in μM), n is the number of cooperative binding sites and Kd is the dissociation constant (in μM).
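The Hill model described above can be explored with a small grid-search sketch. It assumes the form ΔΔAbs = Amax·C^n/(Kd^n + C^n), chosen because it makes Kd the concentration giving half of Amax as stated in the text. The grids and function names are hypothetical, and a grid search is a crude stand-in for the nonlinear fitting software actually used.

```python
def hill(c, a_max, kd, n):
    """Hill curve; at c == kd the value is a_max / 2 for any n."""
    return a_max * c ** n / (kd ** n + c ** n)


def fit_hill(conc, ddabs, kd_grid=None, n_grid=None):
    """Coarse least-squares fit over a (kd, n) grid; returns (a_max, kd, n).

    For each (kd, n) pair the best a_max is found in closed form.
    The default grids are assumptions of this sketch.
    """
    if kd_grid is None:
        kd_grid = [10.0 ** (k / 20.0) for k in range(-20, 61)]  # 0.1 to 1000
    if n_grid is None:
        n_grid = [0.5 + 0.1 * i for i in range(26)]  # 0.5 to 3.0
    best = None
    for n in n_grid:
        for kd in kd_grid:
            x = [hill(c, 1.0, kd, n) for c in conc]
            sxx = sum(xi * xi for xi in x)
            if sxx == 0.0:
                continue  # no information (all concentrations zero)
            a_max = sum(xi * yi for xi, yi in zip(x, ddabs)) / sxx
            sse = sum((yi - a_max * xi) ** 2 for xi, yi in zip(x, ddabs))
            if best is None or sse < best[0]:
                best = (sse, a_max, kd, n)
    return best[1], best[2], best[3]
```

The resolution of the result is limited by the grid spacing, so a finer grid or a local refinement around the best point would be needed for quantitative work.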

Dedication
One of the authors, Professor Chris Abell, died suddenly during the preparation of this manuscript. His fellow authors wish to dedicate this paper to his memory.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.